AITopics | chemical data

Collaborating Authors

chemical data

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ChemLLM: A Chemical Large Language Model

Zhang, Di, Liu, Wei, Tan, Qian, Chen, Jingdan, Yan, Hang, Yan, Yuliang, Li, Jiatong, Huang, Weiran, Yue, Xiangyu, Zhou, Dongzhan, Zhang, Shufei, Su, Mao, Zhong, Hansen, Li, Yuqiang, Ouyang, Wanli

arXiv.org Artificial IntelligenceFeb-9-2024

Large language models (LLMs) have made impressive progress in chemistry applications, including molecular property prediction, molecular generation, experimental protocol design, etc. However, the community lacks a dialogue-based model specifically designed for chemistry. The challenge arises from the fact that most chemical data and scientific knowledge are primarily stored in structured databases, and the direct use of these structured data compromises the model's ability to maintain coherent dialogue. To tackle this issue, we develop a novel template-based instruction construction method that transforms structured knowledge into plain dialogue, making it suitable for language model training. By leveraging this approach, we develop ChemLLM, the first large language model dedicated to chemistry, capable of performing various tasks across chemical disciplines with smooth dialogue interaction. ChemLLM beats GPT-3.5 on all three principal tasks in chemistry, i.e., name conversion, molecular caption, and reaction prediction, and surpasses GPT-4 on two of them. Remarkably, ChemLLM also shows exceptional adaptability to related mathematical and physical tasks despite being trained mainly on chemical-centric corpora. Furthermore, ChemLLM demonstrates proficiency in specialized NLP tasks within chemistry, such as literature translation and cheminformatic programming. ChemLLM opens up a new avenue for exploration within chemical studies, while our method of integrating structured chemical knowledge into dialogue systems sets a new frontier for developing LLMs across various scientific fields. Codes, Datasets, and Model weights are publicly accessible at hf.co/AI4Chem/ChemLLM-7B-Chat.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2402.06852

Country:

Asia > China (0.46)
North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Education (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

nach0: Multimodal Natural and Chemical Languages Foundation Model

Livne, Micha, Miftahutdinov, Zulfat, Tutubalina, Elena, Kuznetsov, Maksim, Polykovskiy, Daniil, Brundyn, Annika, Jhunjhunwala, Aastha, Costa, Anthony, Aliper, Alex, Zhavoronkov, Alex

arXiv.org Artificial IntelligenceNov-21-2023

Large Language Models (LLMs) have substantially driven scientific progress in various domains, and many papers have demonstrated their ability to tackle complex problems with creative solutions. Our paper introduces a new foundation model, nach0, capable of solving various chemical and biological tasks: biomedical question answering, named entity recognition, molecular generation, molecular synthesis, attributes prediction, and others. nach0 is a multi-domain and multi-task encoder-decoder LLM pre-trained on unlabeled text from scientific literature, patents, and molecule strings to incorporate a range of chemical and linguistic knowledge. We employed instruction tuning, where specific task-related instructions are utilized to fine-tune nach0 for the final set of tasks. To train nach0 effectively, we leverage the NeMo framework, enabling efficient parallel optimization of both base and large model versions. Extensive experiments demonstrate that our model outperforms state-of-the-art baselines on single-domain and cross-domain tasks. Furthermore, it can generate high-quality outputs in molecular and textual formats, showcasing its effectiveness in multi-domain setups.

dataset, language model, molecule, (13 more...)

arXiv.org Artificial Intelligence

2311.1241

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.14)
(6 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Look out for potential bias in chemical data sets

#artificialintelligenceSep-11-2019, 09:13:19 GMT

There might be disadvantages to using tried and trusted methods.Credit: Science Photo Library Like most research fields, materials science has embraced'big data', including machine-learning models and techniques. These are being used to predict new materials and properties, and devise routes to existing drugs and chemicals. But machine learning requires training data, such as those on reagents, conditions and starting materials. These are usually gleaned from the literature, and are human-generated. The choice of reagents that researchers use could come, for example, from experience or from previously published work. It might be based on a recommendation passed from supervisor to graduate student, or simply on how easy reagents are to find or buy.

artificial intelligence, chemical data, machine learning, (7 more...)

#artificialintelligence

AI-Alerts: 2019 > 2019-09 > AAAI AI-Alert for Sep 17, 2019 (1.00)

Country: North America > United States (0.18)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

New AI approach bridges the 'slim-data gap' that can stymie deep learning approaches

#artificialintelligenceFeb-27-2019, 04:13:00 GMT

Scientists have developed a deep neural network that sidesteps a problem that has bedeviled efforts to apply artificial intelligence to tackle complex chemistry--a shortage of precisely labeled chemical data. The new method gives scientists an additional tool to apply deep learning to explore drug discovery, new materials for manufacturing, and a swath of other applications. Predicting chemical properties and reactions among millions upon millions of compounds is one of the most daunting tasks that scientists face. There is no source of complete information from which a deep learning program could draw upon. Usually, such a shortage of a vast amount of clean data is a show-stopper for a deep learning project.

artificial intelligence, chemistry, machine learning, (11 more...)

#artificialintelligence

Industry: Health & Medicine (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

AI more accurate than animal testing for spotting toxic chemicals

#artificialintelligenceAug-1-2018, 00:06:34 GMT

Most consumers would be dismayed with how little we know about the majority of chemicals. Only 3 percent of industrial chemicals – mostly drugs and pesticides – are comprehensively tested. Most of the 80,000 to 140,000 chemicals in consumer products have not been tested at all or just examined superficially to see what harm they may do locally, at the site of contact and at extremely high doses. I am a physician and former head of the European Center for the Validation of Alternative Methods of the European Commission (2002-2008), and I am dedicated to finding faster, cheaper and more accurate methods of testing the safety of chemicals. To that end, I now lead a new program at Johns Hopkins University to revamp the safety sciences.

animal test, artificial intelligence, substance, (17 more...)

#artificialintelligence

Country:

Europe (0.39)
North America > United States (0.31)

Industry:

Materials > Chemicals (1.00)
Government > Regional Government > North America Government > United States Government (0.31)

Technology: Information Technology > Artificial Intelligence (0.32)

Add feedback

Artificial intelligence helps with skin cancer detection

#artificialintelligenceAug-25-2017, 12:31:16 GMT

The technology has been devised at the University of Waterloo, together with a team from the Sunnybrook Research Institute. The focus is with the detection of detect melanoma skin cancer. The technology utilizes machine-learning software in order to analyze images of skin lesions. The analysis seeks to provide doctors with objective data on biological markers of melanoma. This is important since early detection of skin cancer has a high success in terms of starting treatment early, whereas late detection is far more serious.

artificial intelligence, artificial intelligence help, detection, (7 more...)

#artificialintelligence

Country: North America > Canada > Quebec > Montreal (0.07)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Skin Cancer (1.00)
Health & Medicine > Therapeutic Area > Dermatology (1.00)

Technology: Information Technology > Artificial Intelligence (1.00)

Add feedback